Analysis of Time Between


Richard  E.  Barlow  and  Bernard  Davis* 


Abstract. A method for analyzing time between failure data is
developed. The method uses total time on test plots for a
non-homogeneous Poisson process failure model. Engine failure data
are used to illustrate the method. A graphical method for determining
optimum replacement intervals is presented.


1. Introduction. Most of the statistical literature concerned
with analyzing failure data assumes that observations are independent
and identically distributed. Although engineering reports often
purport to give Mean Time Between Failure (MTBF) estimates, the
estimates are, in fact, valid in general only if times between
failures are exponentially distributed random variables. Since this
assumption is often not valid, especially for mechanical components,
more sophisticated techniques are required to analyze this type of
data. To focus on the kind of problem we have in mind, we will first
consider some failure data on Caterpillar tractor engines. The actual
data are given in Appendix 1. The data consist of the age of the
tractor at engine failure, the age of the engine at failure, and the
calendar date of the failure event.


*Operations  Research  Center,  University  of  California,  Berkeley, 
California  94720.  This  research  was  supported  by  the  Air  Force  Office 
of  Scientific  Research  under  Grant  AFOSR-77-3179. 


Figure 1 shows engine failure removal times on each of 22 tractors
as a function of tractor age. The large number of failures at
6000 hours was due to a piston failure problem and not a planned
maintenance action. In order to plan maintenance actions, we need a
mathematical model for predicting engine failures as a function of
tractor age as well as engine age. Sections 2 and 3 present a
technique for solving this problem. In Section 4 a graphical method
for determining optimum component replacement based on component life
cycle costing is presented.

2. A Non-homogeneous Poisson Model for Times Between Failures.
In examining 59 engine failures on 22 D9G-66A Caterpillar tractors,
we found that engine age at failure depended on the operating age of
the tractor when the engine was last placed in the tractor. Except
for the original engines in new tractors, engines replacing failed
engines were often repaired engines. Figure 2 is a plot of engine
age at failure versus tractor age when the engine was last placed in
the tractor. Thus crosses on the y-axis, corresponding to x = 0, are
ages at failure of the original engines. It is fairly clear from
Figure 2 that original engines tend to have a longer mean life than
repaired engines. The mean life of original engines is 6149 hours
versus 3241 hours for repaired engines. The standard deviation in
both cases is about 2000 hours.
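The two mean lives quoted above can be recomputed from the engine ages listed in Appendix 1. The following is a minimal sketch rather than the authors' own calculation: the variable names are ours, engines numbered 1 are treated as "original" and all later engines as "repaired", and the sample standard deviation is used on the assumption that this is the statistic being reported.

```python
from statistics import mean, stdev

# Engine age at failure (hours), transcribed from Appendix 1.
# "Original" engines are the first engine in each tractor (Engine = 1).
original = [
    8230, 5085, 3826, 10950, 6052, 6367, 7774, 4394, 10517, 2690, 6259,
    5278, 6378, 6385, 6578, 5161, 6717, 6869, 5556, 3268, 4815, 6150,
]
# "Repaired" engines are all later engines (Engine >= 2).
repaired = [
    3416, 3157, 5988, 4261, 7485, 5021, 3654, 3266, 7187, 18, 1285, 234,
    4232, 3893, 2173, 4763, 1707, 3735, 1671, 2535, 4899, 4996, 1397,
    4974, 1282, 1859, 1171, 4956, 4961, 2531, 1921, 4525, 737, 1386,
    2252, 2823, 3573,
]

for label, ages in (("original", original), ("repaired", repaired)):
    # Report the sample size, mean and sample standard deviation.
    print(f"{label}: n = {len(ages)}, mean = {mean(ages):.1f} h, "
          f"sd = {stdev(ages):.1f} h")
```

The group sizes (22 and 37) add up to the 59 failures analyzed in this section.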
Figure 3 shows a total time on test plot for original (new)
engines and also a plot for repaired engines. The fact that the new
engine plot is more strongly concave indicates that the life of
repaired engines is more nearly exponential. [See Barlow and Campo
(1975) for a discussion of total time on test plots.]
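For readers who want to reproduce a plot of this kind, the sketch below computes the usual scaled total time on test points for a complete (uncensored) sample of lifetimes. It is an illustration under our own naming (`scaled_ttt` is not from the paper), and the lifetimes in the usage example are hypothetical.

```python
def scaled_ttt(lifetimes):
    """Return the scaled total time on test points for a complete sample.

    For ordered lifetimes x_(1) <= ... <= x_(n), the total time on test at
    x_(i) is  sum_{j<=i} (n - j + 1) * (x_(j) - x_(j-1)),  with x_(0) = 0.
    The plot shows T(x_(i)) / T(x_(n)) against i / n.
    """
    x = sorted(lifetimes)
    n = len(x)
    totals = []
    running, previous = 0.0, 0.0
    for j, value in enumerate(x, start=1):
        # n - j + 1 items are still on test between x_(j-1) and x_(j).
        running += (n - j + 1) * (value - previous)
        totals.append(running)
        previous = value
    return [(i / n, t / totals[-1]) for i, t in enumerate(totals, start=1)]


if __name__ == "__main__":
    # Hypothetical lifetimes in hours, for illustration only.
    for u, phi in scaled_ttt([1200, 3400, 5100, 6100, 8000]):
        print(f"{u:.2f}  {phi:.3f}")
```

Points lying close to the diagonal are what one expects from exponential lifetimes, while a plot that bows above the diagonal points toward an increasing failure rate.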

Analysis of Failure Events from Independent Processes.
Intuitively, engine failure processes will depend on both tractor age
and engine age. However, Figures 2 and 3 suggest that tractor age may
be the more significant variable in modelling the engine failure
processes.

We assume that the successive failure events of engines, say, in a
given tractor can be described probabilistically by a non-homogeneous
Poisson process. If $N(t)$ is the number of engine failures in
$[0,t]$ for a particular tractor, then

$$P[N(t) = k] = \frac{[\Lambda(t)]^{k}}{k!}\, e^{-\Lambda(t)}$$

for $k = 0, 1, 2, \ldots$, where $\Lambda(t)$ is the mean number of
engine failures in $[0,t]$. [See Çinlar (1975) for an introduction to
non-homogeneous Poisson processes.] Since we do not know
$\Lambda(t)$, it must be estimated from the data. Our approach is to
use an appropriate total time on test plot to make a preliminary
model identification. (The model is the analytic form of
$\Lambda(t)$.) A final model identification will be made using a
Bayesian approach.
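As a small numerical companion to the Poisson probability above, the sketch below evaluates P[N(t) = k] for a given value of the mean function. The power-law mean function and its parameter values are hypothetical choices made only so that the example runs; they are not estimates from the tractor data.

```python
import math


def nhpp_count_pmf(k, mean_count):
    """P[N(t) = k] when N(t) is Poisson with mean Lambda(t) = mean_count > 0."""
    # Work on the log scale so that large k does not overflow the factorial.
    log_p = k * math.log(mean_count) - mean_count - math.lgamma(k + 1)
    return math.exp(log_p)


def power_law_mean(t, scale, shape):
    """Hypothetical mean function Lambda(t) = (t / scale) ** shape."""
    return (t / scale) ** shape


if __name__ == "__main__":
    # Expected number of failures by 12000 hours under the assumed model.
    lam_t = power_law_mean(12000.0, scale=6000.0, shape=1.5)
    for k in range(6):
        print(k, round(nhpp_count_pmf(k, lam_t), 4))
```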
[Figure 3. Total time on test plots, D9G-66A Caterpillar tractor
engines. New engines: mean 6149 hours, standard deviation 2000 hours,
n = 22. Old (repaired) engines: mean 3241 hours, standard deviation
1842 hours, n = 37.]
The superposition of $n$ independent non-homogeneous Poisson
processes, each with mean function $\Lambda(t)$, will again be a
non-homogeneous Poisson process with mean function $n\Lambda(t)$.
Now let each process run for the same time interval $[0,T]$. Let

$$Z_{(1)} \le Z_{(2)} \le \cdots \le Z_{(N(T))}$$

be the ordered superposed event times on a common age axis, where
$N(T)$ is the total number of events in $[0,T]$. In Figure 1, if all
points were superposed on the x-axis, the ordered values would
correspond to the $Z_{(i)}$'s. (In our case, however, we do not
actually observe engine failures over the same tractor age interval
$[0,T]$ for each tractor. We make this assumption in order to derive
our theoretical result.)

Let $n(u)$ be the number of processes under observation at tractor
(or system) age $u$. In our example (see Figure 1), $n(0) = 22$ and
$n(u) = 22$ up to about age 6000 hours, at which age it drops to 21,
etc. Finally, at age $u = 22507$ hours, $n(u) = 0$. The scaled total
time on test plot for the non-homogeneous Poisson process model is a
plot of

$$\frac{\int_0^{Z_{(i)}} n(u)\,du}{\int_0^{Z_{(N(T))}} n(u)\,du}
\quad \text{versus} \quad \frac{i}{N(T)}$$

for $i = 1, 2, \ldots, N(T)$. Figure 4 is a Total Time on Test Plot
of the data in Appendix 1. We have used linear interpolation to
produce a smooth plot. For this data, $T = 22507$ hours.
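The plotted quantity can be computed directly once an observation window is fixed for each system. The sketch below assumes that system j is observed from age 0 up to a known end age, so that n(u) is the number of systems whose end age is at least u and the integral of n(u) from 0 to z equals the sum over systems of min(z, end age). The function name and the toy data are ours; how end ages should be assigned for the tractor data (for example, the age at the last recorded failure) is an assumption the reader would have to make.

```python
def scaled_ttt_nhpp(failure_ages, end_ages):
    """Scaled total time on test points for pooled failure ages.

    failure_ages: one list of failure ages per system.
    end_ages: age at which observation of each system stops.
    n(u) is the number of systems with end age >= u, so
    integral_0^z n(u) du = sum_j min(z, end_j).
    """
    pooled = sorted(age for system in failure_ages for age in system)

    def exposure(z):
        # Total observed system-hours accumulated up to age z.
        return sum(min(z, end) for end in end_ages)

    total = exposure(pooled[-1])
    count = len(pooled)
    return [(i / count, exposure(z) / total)
            for i, z in enumerate(pooled, start=1)]


if __name__ == "__main__":
    # Two hypothetical systems: failures at ages 2 and 5, and at age 3.
    for x, y in scaled_ttt_nhpp([[2, 5], [3]], end_ages=[5, 3]):
        print(f"{x:.3f}  {y:.3f}")
```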

The  following  theorem  is  the  basis  for  our  preliminary  model 
identification  procedure. 

Theorem 2.1. Assume that $n$ independent non-homogeneous
Poisson processes are observed on $[0,T]$ where $T$ is fixed. Then,
for $0 \le p \le 1$,

$$\lim_{n\to\infty} \frac{Z_{([pN(T)])}}{Z_{(N(T))}}
= \frac{\Lambda^{-1}(p\Lambda(T))}{\Lambda^{-1}(\Lambda(T))}
= \frac{\Lambda^{-1}(p\Lambda(T))}{T}$$

almost surely. $[pN(T)]$ denotes the largest integer in $pN(T)$.
[Note that $n(u) = n$ for $0 \le u \le T$; i.e., all processes are
completely observed.]

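Proof. Let $Y_1, Y_2, \ldots, Y_k, \ldots$ be independent
identically distributed exponential random variables each with unit
mean. Let $S_k = \sum_{i=1}^{k} Y_i$ and $\Lambda_n(x) = n\Lambda(x)$.
Then

$$Z_{([pN(T)])} \stackrel{st}{=} \Lambda_n^{-1}\left(S_{[pN(T)]}\right)$$

where $\stackrel{st}{=}$ means equal to in distribution. Also

$$Z_{([pN(T)])} \stackrel{st}{=}
\Lambda^{-1}\left(p \cdot \frac{S_{[pN(T)]}}{[pN(T)]} \cdot
\frac{[pN(T)]}{pn}\right)$$

since $\Lambda_n(x) = n\Lambda(x)$ implies
$\Lambda_n^{-1}(y) = \Lambda^{-1}(y/n)$. Now $N(T) \to \infty$ almost
surely as $n \to \infty$ so that

$$\frac{S_{[pN(T)]}}{[pN(T)]} \to 1$$

almost surely as $n \to \infty$. Since $EN(T) = n\Lambda(T)$, we see
that

$$\frac{[pN(T)]}{pn} \to \Lambda(T)$$

almost surely as $n \to \infty$. It follows that

$$Z_{([pN(T)])} \to \Lambda^{-1}(p\Lambda(T))$$

almost surely as $n \to \infty$.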
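The limit in Theorem 2.1 can be checked by simulation using the same construction as the proof: superposed event times are generated as the inverse of the superposed mean function applied to partial sums of unit exponentials. The sketch below assumes a hypothetical power-law mean function, Lambda(x) = x ** shape, for which the limiting ratio is p ** (1 / shape); the function name, parameter values and seed are ours.

```python
import random


def simulate_ratio(p, n, shape, horizon, seed=0):
    """Simulate Z_([pN(T)]) / Z_(N(T)) for n superposed power-law processes.

    Each process has the hypothetical mean function Lambda(x) = x ** shape,
    so the superposition has mean function n * x ** shape. Event times are
    generated as Lambda_n^{-1}(S_k), with S_k partial sums of unit
    exponentials, and kept while they fall in [0, horizon].
    """
    rng = random.Random(seed)
    events = []
    s = 0.0
    while True:
        s += rng.expovariate(1.0)
        z = (s / n) ** (1.0 / shape)   # Lambda_n^{-1}(s)
        if z > horizon:
            break
        events.append(z)
    k = int(p * len(events))           # [pN(T)], largest integer in pN(T)
    return events[k - 1] / events[-1]


if __name__ == "__main__":
    p, shape, horizon = 0.5, 2.0, 2.0
    limit = p ** (1.0 / shape)         # Lambda^{-1}(p Lambda(T)) / T
    print(simulate_ratio(p, n=5000, shape=shape, horizon=horizon), limit)
```

With these settings roughly 20000 events are generated, so the simulated ratio should sit close to the limiting value; smaller n gives visibly noisier ratios.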


If $\lambda(u)$, $u > 0$, is the intensity function for our failure
process, then $\Lambda(x) = \int_0^x \lambda(u)\,du$ is the expected
number of failures in $[0,x]$. If
$\lambda(u) = \alpha\lambda^{\alpha}u^{\alpha-1}$, then

$$\frac{\Lambda^{-1}(p\Lambda(T))}{T} = p^{1/\alpha}
\quad \text{for } 0 \le p \le 1.$$

In this case $\Lambda^{-1}(p\Lambda(T))/T$ is concave in
$0 \le p \le 1$ for $\alpha > 1$ and convex in $0 \le p \le 1$ for
$0 < \alpha < 1$. This does not appear to be a good model for the
plot in Figure 4. If

$$\lambda(u) = \frac{u^{\alpha-1}e^{-\beta u}}
{\int_u^{\infty} w^{\alpha-1}e^{-\beta w}\,dw},$$

then $\lambda(u)$ is a gamma intensity function and
$\Lambda^{-1}(x)$ will be approximately linear for large values of
$x$. For this reason, the gamma intensity function with $\alpha > 1$
may be a better model for the plot in Figure 4.
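The near-linearity claimed for the gamma intensity can be seen in closed form when the shape parameter equals 2. The sketch below is restricted to that special case, which we chose only because the tail integral is then elementary; the function names are ours, and general shape values would need an incomplete gamma routine that is not shown here.

```python
import math


def gamma2_intensity(u, beta):
    """Gamma intensity for the special case alpha = 2.

    With alpha = 2 the tail integral has the closed form
    integral_u^inf w * exp(-beta*w) dw = (1 + beta*u) * exp(-beta*u) / beta**2,
    so lambda(u) = beta**2 * u / (1 + beta * u).
    """
    return beta ** 2 * u / (1.0 + beta * u)


def gamma2_mean_function(x, beta):
    """Lambda(x) = integral_0^x lambda(u) du = beta*x - log(1 + beta*x)."""
    return beta * x - math.log1p(beta * x)


if __name__ == "__main__":
    beta = 0.001  # hypothetical rate per hour
    for u in (1e2, 1e3, 1e4, 1e5):
        # The intensity climbs toward beta, so Lambda grows almost linearly.
        print(u, gamma2_intensity(u, beta) / beta,
              gamma2_mean_function(u, beta))
```

Because the intensity levels off at beta, the mean function and hence its inverse are close to straight lines at large ages.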

Since the plot in Figure 4 was based on incomplete data (i.e.,
not all tractors were observed for 22507 hours), Theorem 2.1 does not
strictly apply. However, we can make a valid comparison to a
homogeneous Poisson process using the following theorem.

Theorem 2.2. If $\Lambda(x)/x$ is nondecreasing in $x \ge 0$ and

$$n(x) \le \frac{1}{x}\int_0^x n(u)\,du$$

almost surely, then conditional on $N(T) = N$,

$$\frac{\int_0^{Z_{(i)}} n(u)\,du}{\int_0^{Z_{(N(T))}} n(u)\,du}
\ \stackrel{st}{\ge}\ U_{i:N-1}$$

where $U_{i:N-1}$ is the i-th order statistic from $N - 1$
independent uniform $[0,1]$ random variables. ($\stackrel{st}{\ge}$
means stochastically greater than or equal to.)

Note that if $\lambda(u)$ is nondecreasing, then
$\Lambda(x) = \int_0^x \lambda(u)\,du$ satisfies the condition of
Theorem 2.2. It follows from Theorem 2.2 that if the failure rate is
nondecreasing, then the scaled total time on test plot will tend to
lie above the 45° line [since $E\,U_{i:N-1} = i/N$]. Figure 4
indicates an increasing failure rate process. The distribution of
crossings of the scaled total time on test plot in the case of a
homogeneous Poisson model has been derived by Bo Bergman (1976). The
proof of Theorem 2.2 is similar to that of Theorem 2 in Barlow and
Proschan (1969) and will not be given here.


3. A Bayesian Approach to Model Identification. In Section 2
we discussed a method for preliminary model identification. On the
basis of Figure 4 and the discussion in Section 2, we could choose a
gamma failure process model; i.e.,
$\Lambda(x) = \int_0^x \lambda(u)\,du$ where

$$\lambda(u) = \frac{\beta^{\alpha}u^{\alpha-1}e^{-\beta u}}
{\int_u^{\infty} \beta^{\alpha}w^{\alpha-1}e^{-\beta w}\,dw}.
\tag{3.1}$$

The parameters $\alpha$ and $\beta$ are to be estimated. Given the
model, the likelihood best summarizes the information in the data
concerning the parameters [cf. Basu (1975)].

Let $n(u)$ be the number of systems (e.g., tractors) under
observation at age $u$ and $N(T) = N$ the total number of failures.
The likelihood function, given
$Z = (Z_{(1)} \le Z_{(2)} \le \cdots \le Z_{(N)})$, is easily found
to be

$$L(\alpha,\beta \mid Z) =
\left[\prod_{i=1}^{N} n(Z_{(i)}^{-})\,\lambda(Z_{(i)})\right]
\exp\left[-\int_0^{T} n(u)\lambda(u)\,du\right]$$

where $n(Z_{(i)}^{-})$ is the number of systems under observation
just prior to the i-th observed failure and $\lambda(u)$ is given by
(3.1). Given a prior density $\pi_0(\alpha,\beta)$, the posterior
density on $\alpha$, $\beta$ is

$$\pi(\alpha,\beta \mid Z) =
\frac{L(\alpha,\beta \mid Z)\,\pi_0(\alpha,\beta)}
{\int_0^{\infty}\int_0^{\infty} L(\alpha,\beta \mid Z)\,
\pi_0(\alpha,\beta)\,d\alpha\,d\beta}$$

by Bayes' Theorem. If a diffuse prior, i.e.,
$\pi_0(\alpha,\beta) = c$, is chosen, then
$\pi(\alpha,\beta \mid Z)$ is proportional to the likelihood
function. For illustrative purposes we could graph
$L(\alpha,\beta \mid Z)$ for the data of Appendix 1. The maximum
likelihood estimates can be obtained from contour plots of the
likelihood function.
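A likelihood of this form (a product of n(Z-) times the intensity at each pooled failure age, multiplied by the exponential of minus the integral of n(u) times the intensity) is straightforward to evaluate on the log scale. The sketch below is generic in the intensity: the caller supplies the intensity and its integral as functions. It assumes each system is observed from age 0 to a known end age, which turns the integral into a sum of the mean function over end ages. All names and the toy data are ours, and no fitted values for the tractor data are implied.

```python
import math


def log_likelihood(pooled_failures, end_ages, intensity, mean_function):
    """Log-likelihood for pooled failure ages under an intensity lambda(u).

    pooled_failures: failure ages Z_(1) <= ... <= Z_(N) over all systems.
    end_ages: age at which observation of each system stops.
    intensity, mean_function: callables for lambda(u) and Lambda(x).
    """
    total = 0.0
    for z in pooled_failures:
        # n(Z^-): systems still under observation just before age z.
        at_risk = sum(1 for end in end_ages if end >= z)
        total += math.log(at_risk) + math.log(intensity(z))
    # With system j observed on [0, end_j], the integral of
    # n(u) * lambda(u) over the observation period is sum_j Lambda(end_j).
    total -= sum(mean_function(end) for end in end_ages)
    return total


if __name__ == "__main__":
    # Toy data and a constant intensity, purely for illustration.
    failures, ends = [2.0, 3.0, 5.0], [5.0, 3.0]
    for rate in (0.2, 0.4, 0.6):
        value = log_likelihood(failures, ends,
                               intensity=lambda u, r=rate: r,
                               mean_function=lambda x, r=rate: r * x)
        print(rate, round(value, 4))
```

Evaluating this function over a grid of parameter values gives the surface from which likelihood contours could be drawn.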


4. Graphical Determination of Optimum Replacement Policies.
For our non-homogeneous Poisson process model, $\Lambda(t)$ is the
expected number of component (engine) failures in $[0,t]$. Let $c_1$
be the average cost of repairing the component and $c_2$ the cost of
buying a new replacement component. Then, if a new component is
bought at time $t$, the long run average cost of replacing components
is

$$C(t) = \frac{c_1\Lambda(t) + c_2}{t}.$$

The numerator is just the expected cost of a life cycle of length
$t$. Let $t_0$ be the optimum replacement age, if it exists; i.e.,

$$C(t_0) = \min_{t > 0} \frac{c_1\Lambda(t) + c_2}{t}.$$

If $c_1$ is interpreted as the average time to repair the component
and $c_2$ is the average time to replace with a new component, then
$C(t)$ is the long run average unavailability where an old component
is replaced by a new component at age $t$.
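When a parametric mean function is available, the long run average cost (repair cost times expected failures, plus replacement cost, all divided by the replacement age) can simply be minimized numerically. The sketch below does this on a grid for a hypothetical mean function Lambda(t) = t ** 2 and hypothetical costs, chosen so that the answer is easy to check by calculus (C(t) = t + 4/t is smallest at t = 2). The function names are ours.

```python
def average_cost(t, c1, c2, mean_function):
    """Long run average cost C(t) = (c1 * Lambda(t) + c2) / t."""
    return (c1 * mean_function(t) + c2) / t


def grid_minimizer(c1, c2, mean_function, t_max, steps=10000):
    """Return the grid point in (0, t_max] with the smallest C(t)."""
    grid = [t_max * i / steps for i in range(1, steps + 1)]
    return min(grid, key=lambda t: average_cost(t, c1, c2, mean_function))


if __name__ == "__main__":
    # Hypothetical inputs: Lambda(t) = t**2, c1 = 1, c2 = 4.
    t0 = grid_minimizer(1.0, 4.0, lambda t: t ** 2, t_max=10.0)
    print(t0, average_cost(t0, 1.0, 4.0, lambda t: t ** 2))
```

A finite minimizer of this kind exists only when the failure intensity eventually increases; with a constant intensity the cost keeps falling as t grows and replacement is never worthwhile.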

To determine $t_0$ graphically using the scaled total time on
test plot as in Figure 4, first plot $-c_2/(c_1\Lambda(T))$ on the
x-axis [see Figure 5]. ($\Lambda(T)$ is estimated from the data as in
the previous section.) Construct a tangent to the scaled total time
on test plot as in Figure 5. The projection of the tangent point on
the x-axis will correspond to a ratio of the form $i_0/N(T)$. The
solution is $Z_{(i_0)}$; i.e., $t_0 = Z_{(i_0)}$ will minimize the
average life cycle cost per unit time against $\Lambda(t)$ estimated
(implicitly) by the scaled total time on test plot. This solution is
completely analogous to that of Bergman (1977) for a different model.

The graphical technique above is valid if all failure processes
are observed throughout $[0,T]$ and the number of processes, $n$, is
large. To verify the above solution, let $p_0 = i_0/N(T)$ and note
that by construction

$$\frac{\Lambda^{-1}(p_0\Lambda(T))}{p_0 + \dfrac{c_2}{c_1\Lambda(T)}}
\ge
\frac{\Lambda^{-1}(p\Lambda(T))}{p + \dfrac{c_2}{c_1\Lambda(T)}}$$

for $0 \le p \le 1$. Hence

$$\frac{p + \dfrac{c_2}{c_1\Lambda(T)}}{\Lambda^{-1}(p\Lambda(T))}
\ge
\frac{p_0 + \dfrac{c_2}{c_1\Lambda(T)}}{\Lambda^{-1}(p_0\Lambda(T))}.$$

Now let $t = \Lambda^{-1}(p\Lambda(T))$ and
$t_0 = \Lambda^{-1}(p_0\Lambda(T))$ so that
$p = \Lambda(t)/\Lambda(T)$ and $p_0 = \Lambda(t_0)/\Lambda(T)$.
Multiplying both sides by $c_1\Lambda(T)$ gives

$$\frac{c_1\Lambda(t) + c_2}{t} \ge \frac{c_1\Lambda(t_0) + c_2}{t_0},$$

which implies $t_0$ is the optimum replacement interval. But
$t_0 = \Lambda^{-1}(p_0\Lambda(T)) \approx Z_{([p_0N(T)])}$ by
Theorem 2.1. Hence $Z_{(i_0)}$ will be (approximately) the optimum
replacement age.
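On an empirical plot made of discrete points, the tangent construction amounts to choosing the plotted point that gives the steepest line from the point at minus c2 / (c1 Lambda(T)) on the x-axis. The sketch below implements that selection. The function name and the example points are hypothetical, and the cost ratio is passed in directly because the estimate of Lambda(T) has to come from a separate step.

```python
def optimum_index(ttt_points, cost_ratio):
    """Pick the tangent point of the scaled total time on test plot.

    ttt_points: list of (i / N, scaled total time on test) pairs.
    cost_ratio: c2 / (c1 * Lambda(T)), so the line starts at
                (-cost_ratio, 0) on the x-axis.
    Returns the 1-based index i0 whose point gives the steepest line.
    """
    slopes = [y / (x + cost_ratio) for x, y in ttt_points]
    return max(range(len(slopes)), key=slopes.__getitem__) + 1


if __name__ == "__main__":
    # Hypothetical scaled total time on test points for N = 4 failures.
    points = [(0.25, 0.5), (0.5, 0.7), (0.75, 0.9), (1.0, 1.0)]
    i0 = optimum_index(points, cost_ratio=0.5)
    print("replace at the age of ordered failure number", i0)
```

The returned index is the subscript of the ordered failure age to use as the replacement age. A larger cost ratio moves the starting point further left and pushes the chosen index toward later failures.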
Appendix 1. Hours on Tractor and Engine at the Time of Failure and
the Date of Failure

Tractor  Engine  Hrs. on Tractor  Hrs. on Engine  Date of Failure
   1       1           8230             8230          06-16-71
   2       1           5085             5085          04-16-70
   2       2           8501             3416          06-24-71
   3       1           3826             3826          10-11-71
   3       2           6983             3157          11-10-72
   4       1          10950            10950          05-08-72
   5       1           6052             6052          06-01-70
   5       2          12040             5988          08-21-74
   6       1           6367             6367          06-07-71
   7       1           7774             7774          08-10-70
   7       2          12035             4261          01-11-72
   7       3          19520             7485          12-28-73
   8       1           4394             4394          08-08-69
   8       2           9415             5021          02-24-71
   8       3          13069             3654          03-08-72
   9       1          10517            10517          09-21-70
   9       2          13783             3266          07-12-71
   9       3          20970             7187          07-11-73
   9       4          20988               18          08-14-73
   9       5          22273             1285          03-12-74
   9       6          22507              234          04-16-74
  10       1           2690             2690          05-08-67
  10       2           6922             4232          04-30-70
  10       3          10815             3893          10-20-71
  10       4          12988             2173          06-14-72
  10       5          17751             4763          11-19-73
  10       6          19458             1707          08-15-74
  11       1           6259             6259          03-26-68
  11       2           9994             3735          02-14-72
  12       1           5278             5278          06-28-65
  12       2           6949             1671          07-26-66
  12       3           9484             2535          10-09-67
  12       4          14383             4899          03-04-71
  13       1           6378             6378          08-01-66
  13       2          11374             4996          05-11-72
  13       3          12771             1397          01-09-73
  14       1           6385             6385          09-14-66
  14       2          11359             4974          05-15-70
  15       1           6578             6578          08-03-66
  15       2           7860             1282          03-22-67
  15       3           9719             1859          05-22-68
  16       1           5161             5161          04-15-65
  16       2           6332             1171          11-12-65
  16       3          11288             4956          11-04-69
  16       4          16249             4961          07-11-72
  16       5          18780             2531          06-27-73
  17       1           6717             6717          10-26-66
  18       1           6869             6869          11-01-67
  18       2           8790             1921          12-10-68
  18       3          13315             4525          05-08-71
  19       1           5556             5556          04-03-67
  19       2           6293              737          08-09-67
  19       3           7679             1386          05-18-68
  19       4           9931             2252          12-20-72
  20       1           3268             3268          07-30-71
  20       2           6091             2823          05-31-72
  21       1           4815             4815          01-21-72
  21       2           8388             3573          03-12-73
  22       1           6150             6150          10-31-69

